Overview

Dataset statistics

Number of variables            19
Number of observations         3066766
Missing cells                  358715
Missing cells (%)              0.6%
Duplicate rows                 0
Duplicate rows (%)             0.0%
Total size in memory           565.6 MiB
Average record size in memory  193.4 B
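
The derived figures in this table can be cross-checked from the raw counts. The Python sketch below recomputes the missing-cell percentage and the average record size. It assumes the percentage is taken over variables × observations and that 1 MiB is 2**20 bytes; the report states neither, and the function names are illustrative only.

```python
# Values copied from the Dataset statistics table.
N_VARIABLES = 19
N_OBSERVATIONS = 3_066_766
MISSING_CELLS = 358_715
TOTAL_SIZE_MIB = 565.6


def missing_cells_pct(missing, n_vars, n_obs):
    """Missing cells as a percentage of all cells (assumed: n_vars * n_obs)."""
    return 100.0 * missing / (n_vars * n_obs)


def avg_record_size_bytes(total_mib, n_obs):
    """Average bytes per row, assuming 1 MiB = 2**20 bytes."""
    return total_mib * 2**20 / n_obs


print(f"{missing_cells_pct(MISSING_CELLS, N_VARIABLES, N_OBSERVATIONS):.1f}%")
print(f"{avg_record_size_bytes(TOTAL_SIZE_MIB, N_OBSERVATIONS):.1f} B")
```

The missing-cell count is also consistent with the alerts: five variables each report 71743 missing values, and 5 × 71743 = 358715.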

Variable types

Categorical  5
DateTime     2
Numeric      11
Boolean      1

Alerts

High correlation: RatecodeID is highly overall correlated with tolls_amount
High correlation: VendorID is highly overall correlated with extra
High correlation: congestion_surcharge is highly overall correlated with improvement_surcharge
High correlation: extra is highly overall correlated with VendorID
High correlation: fare_amount is highly overall correlated with total_amount and 1 other field
High correlation: improvement_surcharge is highly overall correlated with congestion_surcharge
High correlation: tip_amount is highly overall correlated with total_amount
High correlation: tolls_amount is highly overall correlated with RatecodeID
High correlation: total_amount is highly overall correlated with fare_amount and 2 other fields
High correlation: trip_distance is highly overall correlated with fare_amount and 1 other field
Imbalance: store_and_fwd_flag is highly imbalanced (94.2%)
Imbalance: payment_type is highly imbalanced (59.0%)
Imbalance: improvement_surcharge is highly imbalanced (96.1%)
Imbalance: congestion_surcharge is highly imbalanced (71.7%)
Imbalance: airport_fee is highly imbalanced (72.2%)
Missing: passenger_count has 71743 (2.3%) missing values
Missing: RatecodeID has 71743 (2.3%) missing values
Missing: store_and_fwd_flag has 71743 (2.3%) missing values
Missing: congestion_surcharge has 71743 (2.3%) missing values
Missing: airport_fee has 71743 (2.3%) missing values
Skewed: trip_distance is highly skewed (γ1 = 810.4075091)
Skewed: mta_tax is highly skewed (γ1 = 35.31543467)
Zeros: passenger_count has 51164 (1.7%) zeros
Zeros: trip_distance has 45862 (1.5%) zeros
Zeros: extra has 1240718 (40.5%) zeros
Zeros: tip_amount has 694757 (22.7%) zeros
Zeros: tolls_amount has 2840307 (92.6%) zeros
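
The report does not define the imbalance percentage shown in the alerts. One definition that appears to reproduce the reported figures is one minus the normalized Shannon entropy of the class counts, with missing values excluded. The Python sketch below works under that assumption; `imbalance_score` is a hypothetical helper name, not a ydata-profiling function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, close to 1 for one dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts taken from the per-variable tables of this report.
store_and_fwd_flag = [2975020, 20003]                  # False, True (missing excluded)
payment_type = [2411462, 532241, 71743, 33297, 18023]  # codes 1, 2, 0, 4, 3

print(f"store_and_fwd_flag: {imbalance_score(store_and_fwd_flag):.1%}")
print(f"payment_type: {imbalance_score(payment_type):.1%}")
```

If the assumption holds, both results should land close to the 94.2% and 59.0% quoted in the alerts.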

Reproduction

Analysis started        2026-01-08 22:53:18.550213
Analysis finished       2026-01-08 22:57:00.277294
Duration                3 minutes and 41.73 seconds
Software version        ydata-profiling v4.18.0
Download configuration  config.json
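
The reported duration follows from the two timestamps. As a small sanity check, the Python sketch below parses them with the standard library and formats the difference in the same style; `format_duration` is an illustrative helper, not part of the profiling tool.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2026-01-08 22:53:18.550213", FMT)
finished = datetime.strptime("2026-01-08 22:57:00.277294", FMT)


def format_duration(delta):
    """Format a timedelta as 'M minutes and S.SS seconds'."""
    minutes, seconds = divmod(delta.total_seconds(), 60)
    return f"{int(minutes)} minutes and {seconds:.2f} seconds"


print(format_duration(finished - started))
```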

Variables

VendorID
Categorical

High correlation 

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   146.2 MiB

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     3066766
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  2
2nd row  2
3rd row  2
4th row  1
5th row  2

Common Values

Value  Count    Frequency (%)
2      2239399  73.0%
1      827367   27.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
2      2239399  73.0%
1      827367   27.0%

Most occurring characters

Value  Count    Frequency (%)
2      2239399  73.0%
1      827367   27.0%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  3066766  100.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      2239399  73.0%
1      827367   27.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  3066766  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
2      2239399  73.0%
1      827367   27.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3066766  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
2      2239399  73.0%
1      827367   27.0%

tpep_pickup_datetime
DateTime

Distinct           1610975
Distinct (%)       52.5%
Missing            0
Missing (%)        0.0%
Memory size        23.4 MiB
Minimum            2008-12-31 23:01:42
Maximum            2023-02-01 00:56:53
Invalid dates      0
Invalid dates (%)  0.0%
Figure: Histogram with fixed size bins (bins=50)

tpep_dropoff_datetime
DateTime

Distinct           1611319
Distinct (%)       52.5%
Missing            0
Missing (%)        0.0%
Memory size        23.4 MiB
Minimum            2009-01-01 14:29:11
Maximum            2023-02-02 09:28:47
Invalid dates      0
Invalid dates (%)  0.0%
Figure: Histogram with fixed size bins (bins=50)

passenger_count
Real number (ℝ)

Missing  Zeros 

Distinct      10
Distinct (%)  < 0.1%
Missing       71743
Missing (%)   2.3%
Infinite      0
Infinite (%)  0.0%
Mean          1.3625321
Minimum       0
Maximum       9
Zeros         51164
Zeros (%)     1.7%
Negative      0
Negative (%)  0.0%
Memory size   23.4 MiB

Quantile statistics

Minimum                    0
5-th percentile            1
Q1                         1
median                     1
Q3                         1
95-th percentile           3
Maximum                    9
Range                      9
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               0.89611997
Coefficient of variation (CV)    0.65768724
Kurtosis                         9.5142525
Mean                             1.3625321
Median Absolute Deviation (MAD)  0
Skewness                         2.8753862
Sum                              4080815
Variance                         0.80303101
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=10)

Common Values

Value      Count    Frequency (%)
1          2261400  73.7%
2          451536   14.7%
3          106353   3.5%
4          53745    1.8%
0          51164    1.7%
5          42681    1.4%
6          28124    0.9%
8          13       < 0.1%
7          6        < 0.1%
9          1        < 0.1%
(Missing)  71743    2.3%

Minimum 10 values

Value  Count    Frequency (%)
0      51164    1.7%
1      2261400  73.7%
2      451536   14.7%
3      106353   3.5%
4      53745    1.8%
5      42681    1.4%
6      28124    0.9%
7      6        < 0.1%
8      13       < 0.1%
9      1        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
9      1        < 0.1%
8      13       < 0.1%
7      6        < 0.1%
6      28124    0.9%
5      42681    1.4%
4      53745    1.8%
3      106353   3.5%
2      451536   14.7%
1      2261400  73.7%
0      51164    1.7%

trip_distance
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct      4387
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          3.847342
Minimum       0
Maximum       258928.15
Zeros         45862
Zeros (%)     1.5%
Negative      0
Negative (%)  0.0%
Memory size   23.4 MiB

Quantile statistics

Minimum                    0
5-th percentile            0.5
Q1                         1.06
median                     1.8
Q3                         3.33
95-th percentile           14.32
Maximum                    258928.15
Range                      258928.15
Interquartile range (IQR)  2.27

Descriptive statistics

Standard deviation               249.58376
Coefficient of variation (CV)    64.871736
Kurtosis                         726436.93
Mean                             3.847342
Median Absolute Deviation (MAD)  0.9
Skewness                         810.40751
Sum                              11798898
Variance                         62292.051
Monotonicity                     Not monotonic
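
Several of these statistics are simple functions of others in the same tables. The Python sketch below recomputes the coefficient of variation, variance, interquartile range and range for trip_distance from the reported mean, standard deviation and quantiles. It assumes the usual textbook definitions (CV = standard deviation / mean, variance = standard deviation squared), which the report does not spell out.

```python
# Figures copied from the trip_distance tables.
mean, std = 3.847342, 249.58376
q1, q3 = 1.06, 3.33
minimum, maximum = 0.0, 258928.15


def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation relative to the mean (dimensionless)."""
    return std_dev / mean_value


cv = coefficient_of_variation(std, mean)
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"CV={cv:.4f} variance={variance:.2f} IQR={iqr:.2f} range={value_range:.2f}")
```

A CV far above 1, next to an IQR of only 2.27, is consistent with the Skewed alert: a handful of extreme distances dominate the standard deviation while most trips stay short.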
Figure: Histogram with fixed size bins (bins=50)

Common Values

Value                Count    Frequency (%)
0                    45862    1.5%
1                    43827    1.4%
0.9                  43473    1.4%
1.1                  42578    1.4%
0.8                  41801    1.4%
1.2                  41147    1.3%
1.3                  39793    1.3%
0.7                  38108    1.2%
1.4                  37286    1.2%
1.5                  35544    1.2%
Other values (4377)  2657347  86.6%

Minimum 10 values

Value  Count  Frequency (%)
0      45862  1.5%
0.01   2137   0.1%
0.02   1419   < 0.1%
0.03   1156   < 0.1%
0.04   862    < 0.1%
0.05   697    < 0.1%
0.06   565    < 0.1%
0.07   582    < 0.1%
0.08   471    < 0.1%
0.09   448    < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
258928.15  1      < 0.1%
225987.37  1      < 0.1%
187872.33  1      < 0.1%
116439.71  1      < 0.1%
85543.66   1      < 0.1%
76886.52   1      < 0.1%
62359.52   1      < 0.1%
52042.3    1      < 0.1%
33205.32   1      < 0.1%
16562.61   1      < 0.1%

RatecodeID
Real number (ℝ)

High correlation  Missing 

Distinct      7
Distinct (%)  < 0.1%
Missing       71743
Missing (%)   2.3%
Infinite      0
Infinite (%)  0.0%
Mean          1.4974396
Minimum       1
Maximum       99
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   23.4 MiB

Quantile statistics

Minimum                    1
5-th percentile            1
Q1                         1
median                     1
Q3                         1
95-th percentile           2
Maximum                    99
Range                      98
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               6.4747667
Coefficient of variation (CV)    4.3238918
Kurtosis                         222.02956
Mean                             1.4974396
Median Absolute Deviation (MAD)  0
Skewness                         14.943792
Sum                              4484866
Variance                         41.922604
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=7)

Common Values

Value      Count    Frequency (%)
1          2839305  92.6%
2          114239   3.7%
5          15043    0.5%
99         13106    0.4%
3          8958     0.3%
4          4366     0.1%
6          6        < 0.1%
(Missing)  71743    2.3%

Minimum 10 values

Value  Count    Frequency (%)
1      2839305  92.6%
2      114239   3.7%
3      8958     0.3%
4      4366     0.1%
5      15043    0.5%
6      6        < 0.1%
99     13106    0.4%

Maximum 10 values

Value  Count    Frequency (%)
99     13106    0.4%
6      6        < 0.1%
5      15043    0.5%
4      4366     0.1%
3      8958     0.3%
2      114239   3.7%
1      2839305  92.6%

store_and_fwd_flag
Boolean

Imbalance  Missing 

Distinct      2
Distinct (%)  < 0.1%
Missing       71743
Missing (%)   2.3%
Memory size   5.8 MiB

Value      Count    Frequency (%)
False      2975020  97.0%
True       20003    0.7%
(Missing)  71743    2.3%

PULocationID
Real number (ℝ)

Distinct      257
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          166.39805
Minimum       1
Maximum       265
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   23.4 MiB

Quantile statistics

Minimum                    1
5-th percentile            48
Q1                         132
median                     162
Q3                         234
95-th percentile           261
Maximum                    265
Range                      264
Interquartile range (IQR)  102

Descriptive statistics

Standard deviation               64.244131
Coefficient of variation (CV)    0.38608705
Kurtosis                         -0.86450402
Mean                             166.39805
Median Absolute Deviation (MAD)  62
Skewness                         -0.25597836
Sum                              5.1030387 × 10^8
Variance                         4127.3083
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
132                 160030   5.2%
237                 148074   4.8%
236                 138391   4.5%
161                 135417   4.4%
186                 109227   3.6%
162                 105334   3.4%
142                 100228   3.3%
230                 98991    3.2%
138                 89188    2.9%
170                 88346    2.9%
Other values (247)  1893540  61.7%

Minimum 10 values

Value  Count  Frequency (%)
1      410    < 0.1%
2      2      < 0.1%
3      39     < 0.1%
4      3649   0.1%
5      56     < 0.1%
6      48     < 0.1%
7      1510   < 0.1%
8      11     < 0.1%
9      44     < 0.1%
10     1356   < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
265    1647   0.1%
264    40116  1.3%
263    66128  2.2%
262    43760  1.4%
261    12842  0.4%
260    640    < 0.1%
259    74     < 0.1%
258    70     < 0.1%
257    69     < 0.1%
256    967    < 0.1%

DOLocationID
Real number (ℝ)

Distinct      261
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          164.39263
Minimum       1
Maximum       265
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   23.4 MiB

Quantile statistics

Minimum                    1
5-th percentile            43
Q1                         114
median                     162
Q3                         234
95-th percentile           262
Maximum                    265
Range                      264
Interquartile range (IQR)  120

Descriptive statistics

Standard deviation               69.943682
Coefficient of variation (CV)    0.42546726
Kurtosis                         -0.92636707
Mean                             164.39263
Median Absolute Deviation (MAD)  69
Skewness                         -0.36632366
Sum                              5.0415373 × 10^8
Variance                         4892.1186
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
236                 146348   4.8%
237                 132364   4.3%
161                 116149   3.8%
230                 89878    2.9%
170                 88783    2.9%
239                 87969    2.9%
142                 87969    2.9%
141                 87655    2.9%
162                 82739    2.7%
48                  77383    2.5%
Other values (251)  2069529  67.5%

Minimum 10 values

Value  Count  Frequency (%)
1      7526   0.2%
2      23     < 0.1%
3      198    < 0.1%
4      12165  0.4%
5      56     < 0.1%
6      82     < 0.1%
7      9434   0.3%
8      45     < 0.1%
9      259    < 0.1%
10     4210   0.1%

Maximum 10 values

Value  Count  Frequency (%)
265    10958  0.4%
264    22591  0.7%
263    69319  2.3%
262    51502  1.7%
261    12427  0.4%
260    2343   0.1%
259    366    < 0.1%
258    652    < 0.1%
257    1238   < 0.1%
256    6842   0.2%

payment_type
Categorical

Imbalance 

Distinct      5
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   146.2 MiB

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     3066766
Distinct characters  5
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  2
2nd row  1
3rd row  1
4th row  1
5th row  1

Common Values

Value  Count    Frequency (%)
1      2411462  78.6%
2      532241   17.4%
0      71743    2.3%
4      33297    1.1%
3      18023    0.6%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1      2411462  78.6%
2      532241   17.4%
0      71743    2.3%
4      33297    1.1%
3      18023    0.6%

Most occurring characters

Value  Count    Frequency (%)
1      2411462  78.6%
2      532241   17.4%
0      71743    2.3%
4      33297    1.1%
3      18023    0.6%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  3066766  100.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
1      2411462  78.6%
2      532241   17.4%
0      71743    2.3%
4      33297    1.1%
3      18023    0.6%

Most occurring scripts

Value   Count    Frequency (%)
Common  3066766  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
1      2411462  78.6%
2      532241   17.4%
0      71743    2.3%
4      33297    1.1%
3      18023    0.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3066766  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
1      2411462  78.6%
2      532241   17.4%
0      71743    2.3%
4      33297    1.1%
3      18023    0.6%

fare_amount
Real number (ℝ)

High correlation 

Distinct      6873
Distinct (%)  0.2%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          18.367069
Minimum       -900
Maximum       1160.1
Zeros         1110
Zeros (%)     < 0.1%
Negative      25049
Negative (%)  0.8%
Memory size   23.4 MiB

Quantile statistics

Minimum                    -900
5-th percentile            5.8
Q1                         8.6
median                     12.8
Q3                         20.5
95-th percentile           65.3
Maximum                    1160.1
Range                      2060.1
Interquartile range (IQR)  11.9

Descriptive statistics

Standard deviation               17.807822
Coefficient of variation (CV)    0.96955166
Kurtosis                         49.554467
Mean                             18.367069
Median Absolute Deviation (MAD)  4.9
Skewness                         3.2212102
Sum                              56327502
Variance                         317.11852
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value                Count    Frequency (%)
8.6                  149461   4.9%
9.3                  146821   4.8%
7.9                  146075   4.8%
10                   143521   4.7%
7.2                  139156   4.5%
10.7                 135232   4.4%
11.4                 128910   4.2%
6.5                  122739   4.0%
12.1                 120559   3.9%
70                   113028   3.7%
Other values (6863)  1721264  56.1%

Minimum 10 values

Value   Count  Frequency (%)
-900    1      < 0.1%
-750    1      < 0.1%
-650    1      < 0.1%
-600    1      < 0.1%
-580    1      < 0.1%
-500    1      < 0.1%
-497.9  1      < 0.1%
-495.1  1      < 0.1%
-480    1      < 0.1%
-425.8  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
1160.1  1      < 0.1%
999     1      < 0.1%
900     1      < 0.1%
750     1      < 0.1%
701.6   1      < 0.1%
656.8   1      < 0.1%
655.35  1      < 0.1%
650     1      < 0.1%
625     1      < 0.1%
600     3      < 0.1%

extra
Real number (ℝ)

High correlation  Zeros 

Distinct      68
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          1.5378416
Minimum       -7.5
Maximum       12.5
Zeros         1240718
Zeros (%)     40.5%
Negative      12407
Negative (%)  0.4%
Memory size   23.4 MiB

Quantile statistics

Minimum                    -7.5
5-th percentile            0
Q1                         0
median                     1
Q3                         2.5
95-th percentile           5
Maximum                    12.5
Range                      20
Interquartile range (IQR)  2.5

Descriptive statistics

Standard deviation               1.7895925
Coefficient of variation (CV)    1.1637041
Kurtosis                         2.1283702
Mean                             1.5378416
Median Absolute Deviation (MAD)  1
Skewness                         1.2686481
Sum                              4716200.2
Variance                         3.2026412
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value              Count    Frequency (%)
0                  1240718  40.5%
2.5                763716   24.9%
1                  564096   18.4%
5                  209329   6.8%
3.5                172569   5.6%
6                  23442    0.8%
7.5                21389    0.7%
3.75               15389    0.5%
8.75               12373    0.4%
1.25               9974     0.3%
Other values (58)  33771    1.1%

Minimum 10 values

Value  Count    Frequency (%)
-7.5   141      < 0.1%
-6     256      < 0.1%
-5     859      < 0.1%
-4.5   1        < 0.1%
-3.5   1        < 0.1%
-2.5   3757     0.1%
-1.25  6        < 0.1%
-1     7383     0.2%
-0.5   3        < 0.1%
0      1240718  40.5%

Maximum 10 values

ValueCountFrequency (%)
12.52
 
< 0.1%
11.252387
 
0.1%
112
 
< 0.1%
10772
 
< 0.1%
9.753115
 
0.1%
9.451
 
< 0.1%
9.35
 
< 0.1%
9.255
 
< 0.1%
8.975
 
< 0.1%
8.7512373
0.4%

mta_tax
Real number (ℝ)

Skewed 

Distinct      10
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          0.48828998
Minimum       -0.5
Maximum       53.16
Zeros         23421
Zeros (%)     0.8%
Negative      24501
Negative (%)  0.8%
Memory size   23.4 MiB

Quantile statistics

Minimum                    -0.5
5-th percentile            0.5
Q1                         0.5
median                     0.5
Q3                         0.5
95-th percentile           0.5
Maximum                    53.16
Range                      53.66
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               0.1034641
Coefficient of variation (CV)    0.21189069
Kurtosis                         21970.425
Mean                             0.48828998
Median Absolute Deviation (MAD)  0
Skewness                         35.315435
Sum                              1497471.1
Variance                         0.01070482
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=10)

Common Values

Value  Count    Frequency (%)
0.5    3018062  98.4%
-0.5   24501    0.8%
0      23421    0.8%
0.8    771      < 0.1%
4      4        < 0.1%
0.3    3        < 0.1%
1.6    1        < 0.1%
53.16  1        < 0.1%
1.09   1        < 0.1%
1.05   1        < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
-0.5   24501    0.8%
0      23421    0.8%
0.3    3        < 0.1%
0.5    3018062  98.4%
0.8    771      < 0.1%
1.05   1        < 0.1%
1.09   1        < 0.1%
1.6    1        < 0.1%
4      4        < 0.1%
53.16  1        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
53.16  1        < 0.1%
4      4        < 0.1%
1.6    1        < 0.1%
1.09   1        < 0.1%
1.05   1        < 0.1%
0.8    771      < 0.1%
0.5    3018062  98.4%
0.3    3        < 0.1%
0      23421    0.8%
-0.5   24501    0.8%

tip_amount
Real number (ℝ)

High correlation  Zeros 

Distinct      4036
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          3.3679407
Minimum       -96.22
Maximum       380.8
Zeros         694757
Zeros (%)     22.7%
Negative      225
Negative (%)  < 0.1%
Memory size   23.4 MiB

Quantile statistics

Minimum                    -96.22
5-th percentile            0
Q1                         1
median                     2.72
Q3                         4.2
95-th percentile           11.11
Maximum                    380.8
Range                      477.02
Interquartile range (IQR)  3.2

Descriptive statistics

Standard deviation               3.8267595
Coefficient of variation (CV)    1.1362313
Kurtosis                         92.756546
Mean                             3.3679407
Median Absolute Deviation (MAD)  1.72
Skewness                         4.2238312
Sum                              10328686
Variance                         14.644088
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value                Count    Frequency (%)
0                    694757   22.7%
2                    152040   5.0%
1                    132857   4.3%
3                    76829    2.5%
5                    42332    1.4%
2.8                  40013    1.3%
3.5                  34750    1.1%
1.5                  34000    1.1%
4                    33148    1.1%
2.1                  32931    1.1%
Other values (4026)  1793109  58.5%

Minimum 10 values

Value   Count  Frequency (%)
-96.22  1      < 0.1%
-90.09  1      < 0.1%
-64.66  1      < 0.1%
-51.89  1      < 0.1%
-35.03  1      < 0.1%
-33.93  1      < 0.1%
-20     4      < 0.1%
-14.5   1      < 0.1%
-13.9   1      < 0.1%
-12.81  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
380.8   1      < 0.1%
270.3   1      < 0.1%
222.21  1      < 0.1%
211.5   1      < 0.1%
202     1      < 0.1%
201.65  1      < 0.1%
200.2   1      < 0.1%
200     3      < 0.1%
160     1      < 0.1%
150     1      < 0.1%

tolls_amount
Real number (ℝ)

High correlation  Zeros 

Distinct      776
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          0.51849066
Minimum       -65
Maximum       196.99
Zeros         2840307
Zeros (%)     92.6%
Negative      1377
Negative (%)  < 0.1%
Memory size   23.4 MiB

Quantile statistics

Minimum                    -65
5-th percentile            0
Q1                         0
median                     0
Q3                         0
95-th percentile           6.55
Maximum                    196.99
Range                      261.99
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               2.017579
Coefficient of variation (CV)    3.8912543
Kurtosis                         78.69951
Mean                             0.51849066
Median Absolute Deviation (MAD)  0
Skewness                         5.3893505
Sum                              1590089.5
Variance                         4.0706251
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
0                   2840307  92.6%
6.55                207651   6.8%
12.75               1780     0.1%
3                   1602     0.1%
14.75               1408     < 0.1%
-6.55               1165     < 0.1%
13.1                993      < 0.1%
11.55               868      < 0.1%
11.75               863      < 0.1%
13.75               725      < 0.1%
Other values (766)  9404     0.3%

Minimum 10 values

Value   Count  Frequency (%)
-65     1      < 0.1%
-39.3   1      < 0.1%
-34.05  1      < 0.1%
-30.3   1      < 0.1%
-30.05  1      < 0.1%
-30     2      < 0.1%
-29.85  1      < 0.1%
-29.5   1      < 0.1%
-27.5   1      < 0.1%
-27.3   2      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
196.99  1      < 0.1%
92.75   1      < 0.1%
86.55   1      < 0.1%
81.55   1      < 0.1%
81      2      < 0.1%
78      1      < 0.1%
73.75   1      < 0.1%
70      1      < 0.1%
69.3    1      < 0.1%
65      1      < 0.1%

improvement_surcharge
Categorical

High correlation  Imbalance 

Distinct      5
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   152.1 MiB

Length

Max length     4
Median length  3
Mean length    3.0082018
Min length     3

Characters and Unicode

Total characters     9225451
Distinct characters  5
Distinct categories  3
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
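
Because this numeric column was profiled as categorical, the length and character statistics describe the string form of each value (for example "1.0" or "-0.3"). The Python sketch below recomputes the total character count and the mean length from the value counts reported for improvement_surcharge. It assumes each value is rendered exactly as it appears in the report; the helper names are illustrative.

```python
# (string value, count) pairs as reported for improvement_surcharge.
value_counts = [
    ("1.0", 3035371),
    ("-1.0", 25117),
    ("0.3", 5269),
    ("0.0", 973),
    ("-0.3", 36),
]


def total_characters(pairs):
    """Sum of string lengths over all rows."""
    return sum(len(value) * count for value, count in pairs)


def mean_length(pairs):
    """Average string length per row."""
    rows = sum(count for _, count in pairs)
    return total_characters(pairs) / rows


print(total_characters(value_counts))
print(round(mean_length(value_counts), 7))
```

The same pairs also account for the per-character counts: the minus sign can only come from the two negative values, 25117 + 36 = 25153 occurrences.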

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  1.0
2nd row  1.0
3rd row  1.0
4th row  1.0
5th row  1.0

Common Values

Value  Count    Frequency (%)
1.0    3035371  99.0%
-1.0   25117    0.8%
0.3    5269     0.2%
0.0    973      < 0.1%
-0.3   36       < 0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1.0    3060488  99.8%
0.3    5305     0.2%
0.0    973      < 0.1%

Most occurring characters

Value  Count    Frequency (%)
0      3067739  33.3%
.      3066766  33.2%
1      3060488  33.2%
-      25153    0.3%
3      5305     0.1%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     6133532  66.5%
Other Punctuation  3066766  33.2%
Dash Punctuation   25153    0.3%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      3067739  50.0%
1      3060488  49.9%
3      5305     0.1%
Other Punctuation
Value  Count    Frequency (%)
.      3066766  100.0%
Dash Punctuation
Value  Count    Frequency (%)
-      25153    100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  9225451  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      3067739  33.3%
.      3066766  33.2%
1      3060488  33.2%
-      25153    0.3%
3      5305     0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  9225451  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      3067739  33.3%
.      3066766  33.2%
1      3060488  33.2%
-      25153    0.3%
3      5305     0.1%

total_amount
Real number (ℝ)

High correlation 

Distinct      15871
Distinct (%)  0.5%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          27.020383
Minimum       -751
Maximum       1169.4
Zeros         568
Zeros (%)     < 0.1%
Negative      25204
Negative (%)  0.8%
Memory size   23.4 MiB

Quantile statistics

Minimum                    -751
5-th percentile            10.92
Q1                         15.4
median                     20.16
Q3                         28.7
95-th percentile           80.25
Maximum                    1169.4
Range                      1920.4
Interquartile range (IQR)  13.3

Descriptive statistics

Standard deviation               22.163589
Coefficient of variation (CV)    0.82025443
Kurtosis                         26.595788
Mean                             27.020383
Median Absolute Deviation (MAD)  5.78
Skewness                         2.8328288
Sum                              82865192
Variance                         491.22468
Monotonicity                     Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
16.8                  48536    1.6%
12.6                  45398    1.5%
21                    37923    1.2%
15.12                 26389    0.9%
15.96                 26375    0.9%
14.28                 24988    0.8%
18.48                 24939    0.8%
17.64                 24786    0.8%
14                    24305    0.8%
19.32                 24192    0.8%
Other values (15861)  2758935  90.0%

Minimum 10 values

Value    Count  Frequency (%)
-751     1      < 0.1%
-630.7   1      < 0.1%
-603.5   1      < 0.1%
-601     1      < 0.1%
-583.5   1      < 0.1%
-520.55  1      < 0.1%
-497.85  1      < 0.1%
-481     1      < 0.1%
-430.8   1      < 0.1%
-401     1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
1169.4  1      < 0.1%
1000    1      < 0.1%
901     1      < 0.1%
751     1      < 0.1%
705.6   1      < 0.1%
667.1   1      < 0.1%
656.85  1      < 0.1%
651     1      < 0.1%
626     1      < 0.1%
614.45  1      < 0.1%

congestion_surcharge
Categorical

High correlation  Imbalance  Missing 

Distinct      3
Distinct (%)  < 0.1%
Missing       71743
Missing (%)   2.3%
Memory size   152.4 MiB

Length

Max length     4
Median length  3
Mean length    3.0065836
Min length     3

Characters and Unicode

Total characters     9004787
Distinct characters  5
Distinct categories  3
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  2.5
2nd row  2.5
3rd row  2.5
4th row  0.0
5th row  2.5

Common Values

Value      Count    Frequency (%)
2.5        2744268  89.5%
0.0        231037   7.5%
-2.5       19718    0.6%
(Missing)  71743    2.3%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
2.5    2763986  92.3%
0.0    231037   7.7%

Most occurring characters

Value  Count    Frequency (%)
.      2995023  33.3%
2      2763986  30.7%
5      2763986  30.7%
0      462074   5.1%
-      19718    0.2%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     5990046  66.5%
Other Punctuation  2995023  33.3%
Dash Punctuation   19718    0.2%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      2763986  46.1%
5      2763986  46.1%
0      462074   7.7%
Other Punctuation
Value  Count    Frequency (%)
.      2995023  100.0%
Dash Punctuation
Value  Count    Frequency (%)
-      19718    100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  9004787  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
.      2995023  33.3%
2      2763986  30.7%
5      2763986  30.7%
0      462074   5.1%
-      19718    0.2%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  9004787  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
.      2995023  33.3%
2      2763986  30.7%
5      2763986  30.7%
0      462074   5.1%
-      19718    0.2%

airport_fee
Categorical

Imbalance  Missing 

Distinct      3
Distinct (%)  < 0.1%
Missing       71743
Missing (%)   2.3%
Memory size   152.6 MiB

Length

Max length     5
Median length  3
Mean length    3.0895399
Min length     3

Characters and Unicode

Total characters     9253243
Distinct characters  6
Distinct categories  3
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0.0
2nd row  0.0
3rd row  0.0
4th row  1.25
5th row  0.0

Common Values

Value      Count    Frequency (%)
0.0        2730456  89.0%
1.25       260960   8.5%
-1.25      3607     0.1%
(Missing)  71743    2.3%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0.0    2730456  91.2%
1.25   264567   8.8%

Most occurring characters

Value  Count    Frequency (%)
0      5460912  59.0%
.      2995023  32.4%
1      264567   2.9%
2      264567   2.9%
5      264567   2.9%
-      3607     < 0.1%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     6254613  67.6%
Other Punctuation  2995023  32.4%
Dash Punctuation   3607     < 0.1%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      5460912  87.3%
1      264567   4.2%
2      264567   4.2%
5      264567   4.2%
Other Punctuation
Value  Count    Frequency (%)
.      2995023  100.0%
Dash Punctuation
Value  Count    Frequency (%)
-      3607     100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  9253243  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      5460912  59.0%
.      2995023  32.4%
1      264567   2.9%
2      264567   2.9%
5      264567   2.9%
-      3607     < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  9253243  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      5460912  59.0%
.      2995023  32.4%
1      264567   2.9%
2      264567   2.9%
5      264567   2.9%
-      3607     < 0.1%

Interactions

Figure: pairwise interaction plots between variables
2026-01-08T22:55:44.418128image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:55.210250image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:02.156594image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:07.741338image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:13.661496image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:19.636389image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:25.672839image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:31.614367image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:37.956906image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:33.487829image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:39.516388image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:44.899181image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:55.877041image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:02.789608image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:08.275555image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:14.287184image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:20.122596image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:26.254747image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:32.094842image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:38.678351image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:34.109430image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:40.014424image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:45.392543image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:56.432534image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:03.308984image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:08.783324image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:14.944228image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:20.623912image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:26.865298image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:32.566754image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:39.330358image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:34.740212image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:40.494481image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:46.072499image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:55:56.941798image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:03.819169image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:09.273900image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:15.539740image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:21.106838image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:27.512481image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2026-01-08T22:56:33.090271image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/

Correlations

variable  DOLocationID  PULocationID  RatecodeID  VendorID  airport_fee  congestion_surcharge  extra  fare_amount  improvement_surcharge  mta_tax  passenger_count  payment_type  store_and_fwd_flag  tip_amount  tolls_amount  total_amount  trip_distance
DOLocationID  1.000  0.087  -0.045  0.011  0.078  0.147  -0.007  -0.109  0.013  0.026  -0.007  0.031  0.009  -0.016  -0.056  -0.097  -0.108
PULocationID  0.087  1.000  -0.120  0.025  0.374  0.216  -0.027  -0.137  0.011  0.011  -0.015  0.035  0.005  -0.038  -0.122  -0.128  -0.141
RatecodeID  -0.045  -0.120  1.000  0.110  0.021  0.229  -0.089  0.349  0.013  -0.259  0.050  0.033  0.004  0.132  0.531  0.339  0.267
VendorID  0.011  0.025  0.110  1.000  0.043  0.050  0.582  0.008  0.058  0.000  0.234  0.060  0.102  0.004  0.007  0.040  0.000
airport_fee  0.078  0.374  0.021  0.043  1.000  0.336  0.426  0.111  0.268  0.000  0.024  0.136  0.006  0.017  0.060  0.166  0.000
congestion_surcharge  0.147  0.216  0.229  0.050  0.336  1.000  0.286  0.077  0.629  0.000  0.016  0.366  0.007  0.029  0.111  0.099  0.000
extra  -0.007  -0.027  -0.089  0.582  0.426  0.286  1.000  0.059  0.227  0.139  -0.045  0.147  0.060  0.111  0.139  0.148  0.098
fare_amount  -0.109  -0.137  0.349  0.008  0.111  0.077  0.059  1.000  0.059  0.038  0.040  0.033  0.000  0.452  0.425  0.966  0.900
improvement_surcharge  0.013  0.011  0.013  0.058  0.268  0.629  0.227  0.059  1.000  0.032  0.011  0.276  0.023  0.035  0.070  0.066  0.009
mta_tax  0.026  0.011  -0.259  0.000  0.000  0.000  0.139  0.038  0.032  1.000  -0.016  0.005  0.000  0.065  -0.056  0.042  0.024
passenger_count  -0.007  -0.015  0.050  0.234  0.024  0.016  -0.045  0.040  0.011  -0.016  1.000  0.031  0.030  0.007  0.039  0.038  0.037
payment_type  0.031  0.035  0.033  0.060  0.136  0.366  0.147  0.033  0.276  0.005  0.031  1.000  0.012  0.031  0.019  0.103  0.005
store_and_fwd_flag  0.009  0.005  0.004  0.102  0.006  0.007  0.060  0.000  0.023  0.000  0.030  0.012  1.000  0.000  0.007  0.004  0.000
tip_amount  -0.016  -0.038  0.132  0.004  0.017  0.029  0.111  0.452  0.035  0.065  0.007  0.031  0.000  1.000  0.249  0.592  0.431
tolls_amount  -0.056  -0.122  0.531  0.007  0.060  0.111  0.139  0.425  0.070  -0.056  0.039  0.019  0.007  0.249  1.000  0.436  0.407
total_amount  -0.097  -0.128  0.339  0.040  0.166  0.099  0.148  0.966  0.066  0.042  0.038  0.103  0.004  0.592  0.436  1.000  0.876
trip_distance  -0.108  -0.141  0.267  0.000  0.000  0.000  0.098  0.900  0.009  0.024  0.037  0.005  0.000  0.431  0.407  0.876  1.000
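
As an illustration of how the matrix above relates to the "High correlation" alerts in the overview, here is a minimal sketch that flags variable pairs whose absolute coefficient reaches a cutoff. The cutoff of 0.5 and the helper name `high_correlation_pairs` are assumptions made for this example and are not taken from the report's configuration; 0.5 simply happens to reproduce the pairs named in the alerts. Only a handful of coefficients are copied from the matrix.

```python
# A few coefficients copied from the correlation matrix above.
# The matrix is symmetric, so each pair is listed once.
corr = {
    ("fare_amount", "total_amount"): 0.966,
    ("fare_amount", "trip_distance"): 0.900,
    ("total_amount", "trip_distance"): 0.876,
    ("congestion_surcharge", "improvement_surcharge"): 0.629,
    ("tip_amount", "total_amount"): 0.592,
    ("VendorID", "extra"): 0.582,
    ("RatecodeID", "tolls_amount"): 0.531,
    ("fare_amount", "tip_amount"): 0.452,
    ("DOLocationID", "PULocationID"): 0.087,
}

THRESHOLD = 0.5  # assumed cutoff for this sketch, not read from the report


def high_correlation_pairs(pairs, threshold=THRESHOLD):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


for a, b, r in high_correlation_pairs(corr):
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cutoff, fare_amount/tip_amount (0.452) and the location ID pair (0.087) are left out, while the seven stronger pairs are reported.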

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
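
One common way to compute a nullity correlation is to turn each column into a 0/1 "is missing" indicator and take the Pearson correlation of the indicators. The sketch below does that in plain Python. The toy rows, the `hypothetical_col` column, and the function name `nullity_correlation` are assumptions made for illustration; they are not data or code from this report.

```python
import math

# Toy rows (hypothetical values); None marks a missing cell.
# passenger_count and RatecodeID are missing in the same rows,
# hypothetical_col follows a different pattern.
rows = [
    {"passenger_count": 1.0, "RatecodeID": 1.0, "hypothetical_col": None},
    {"passenger_count": 1.0, "RatecodeID": 1.0, "hypothetical_col": "a"},
    {"passenger_count": 2.0, "RatecodeID": 1.0, "hypothetical_col": "b"},
    {"passenger_count": 1.0, "RatecodeID": 2.0, "hypothetical_col": "c"},
    {"passenger_count": None, "RatecodeID": None, "hypothetical_col": None},
    {"passenger_count": None, "RatecodeID": None, "hypothetical_col": "d"},
]


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return math.nan
    return cov / math.sqrt(var_a * var_b)


print(nullity_correlation(rows, "passenger_count", "RatecodeID"))
print(nullity_correlation(rows, "passenger_count", "hypothetical_col"))
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.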

Sample

First rows

index  VendorID  tpep_pickup_datetime  tpep_dropoff_datetime  passenger_count  trip_distance  RatecodeID  store_and_fwd_flag  PULocationID  DOLocationID  payment_type  fare_amount  extra  mta_tax  tip_amount  tolls_amount  improvement_surcharge  total_amount  congestion_surcharge  airport_fee
0  2  2023-01-01 00:32:10  2023-01-01 00:40:36  1.0  0.97  1.0  N  161  141  2  9.3  1.00  0.5  0.00  0.0  1.0  14.30  2.5  0.00
1  2  2023-01-01 00:55:08  2023-01-01 01:01:27  1.0  1.10  1.0  N  43  237  1  7.9  1.00  0.5  4.00  0.0  1.0  16.90  2.5  0.00
2  2  2023-01-01 00:25:04  2023-01-01 00:37:49  1.0  2.51  1.0  N  48  238  1  14.9  1.00  0.5  15.00  0.0  1.0  34.90  2.5  0.00
3  1  2023-01-01 00:03:48  2023-01-01 00:13:25  0.0  1.90  1.0  N  138  7  1  12.1  7.25  0.5  0.00  0.0  1.0  20.85  0.0  1.25
4  2  2023-01-01 00:10:29  2023-01-01 00:21:19  1.0  1.43  1.0  N  107  79  1  11.4  1.00  0.5  3.28  0.0  1.0  19.68  2.5  0.00
5  2  2023-01-01 00:50:34  2023-01-01 01:02:52  1.0  1.84  1.0  N  161  137  1  12.8  1.00  0.5  10.00  0.0  1.0  27.80  2.5  0.00
6  2  2023-01-01 00:09:22  2023-01-01 00:19:49  1.0  1.66  1.0  N  239  143  1  12.1  1.00  0.5  3.42  0.0  1.0  20.52  2.5  0.00
7  2  2023-01-01 00:27:12  2023-01-01 00:49:56  1.0  11.70  1.0  N  142  200  1  45.7  1.00  0.5  10.74  3.0  1.0  64.44  2.5  0.00
8  2  2023-01-01 00:21:44  2023-01-01 00:36:40  1.0  2.95  1.0  N  164  236  1  17.7  1.00  0.5  5.68  0.0  1.0  28.38  2.5  0.00
9  2  2023-01-01 00:39:42  2023-01-01 00:50:36  1.0  3.01  1.0  N  141  107  2  14.9  1.00  0.5  0.00  0.0  1.0  19.90  2.5  0.00

Last rows

index  VendorID  tpep_pickup_datetime  tpep_dropoff_datetime  passenger_count  trip_distance  RatecodeID  store_and_fwd_flag  PULocationID  DOLocationID  payment_type  fare_amount  extra  mta_tax  tip_amount  tolls_amount  improvement_surcharge  total_amount  congestion_surcharge  airport_fee
3066756  1  2023-01-31 23:05:36  2023-01-31 23:20:37  NaN  0.00  NaN  None  161  148  0  12.74  0.0  0.5  0.00  0.0  1.0  16.74  NaN  NaN
3066757  2  2023-01-31 23:08:54  2023-01-31 23:32:23  NaN  9.44  NaN  None  231  83  0  33.08  0.0  0.5  5.56  0.0  1.0  42.64  NaN  NaN
3066758  1  2023-01-31 23:10:56  2023-01-31 23:23:37  NaN  0.00  NaN  None  162  151  0  12.00  1.0  0.5  9.40  0.0  1.0  28.40  NaN  NaN
3066759  1  2023-01-31 23:54:02  2023-02-01 00:23:17  NaN  0.00  NaN  None  68  160  0  27.00  1.0  0.5  10.55  0.0  1.0  44.55  NaN  NaN
3066760  2  2023-01-31 23:30:20  2023-01-31 23:34:38  NaN  0.82  NaN  None  231  144  0  15.21  0.0  0.5  3.84  0.0  1.0  23.05  NaN  NaN
3066761  2  2023-01-31 23:58:34  2023-02-01 00:12:33  NaN  3.05  NaN  None  107  48  0  15.80  0.0  0.5  3.96  0.0  1.0  23.76  NaN  NaN
3066762  2  2023-01-31 23:31:09  2023-01-31 23:50:36  NaN  5.80  NaN  None  112  75  0  22.43  0.0  0.5  2.64  0.0  1.0  29.07  NaN  NaN
3066763  2  2023-01-31 23:01:05  2023-01-31 23:25:36  NaN  4.67  NaN  None  114  239  0  17.61  0.0  0.5  5.32  0.0  1.0  26.93  NaN  NaN
3066764  2  2023-01-31 23:40:00  2023-01-31 23:53:00  NaN  3.15  NaN  None  230  79  0  18.15  0.0  0.5  4.43  0.0  1.0  26.58  NaN  NaN
3066765  2  2023-01-31 23:07:32  2023-01-31 23:21:56  NaN  2.85  NaN  None  262  143  0  15.97  0.0  0.5  2.00  0.0  1.0  21.97  NaN  NaN